38 research outputs found

    Approximation algorithm for the kinetic robust K-center problem

    Get PDF
    AbstractTwo complications frequently arise in real-world applications, motion and the contamination of data by outliers. We consider a fundamental clustering problem, the k-center problem, within the context of these two issues. We are given a finite point set S of size n and an integer k. In the standard k-center problem, the objective is to compute a set of k center points to minimize the maximum distance from any point of S to its closest center, or equivalently, the smallest radius such that S can be covered by k disks of this radius. In the discrete k-center problem the disk centers are drawn from the points of S, and in the absolute k-center problem the disk centers are unrestricted.We generalize this problem in two ways. First, we assume that points are in continuous motion, and the objective is to maintain a solution over time. Second, we assume that some given robustness parameter 0<t⩽1 is given, and the objective is to compute the smallest radius such that there exist k disks of this radius that cover at least ⌈tn⌉ points of S. We present a kinetic data structure (in the KDS framework) that maintains a (3+ε)-approximation for the robust discrete k-center problem and a (4+ε)-approximation for the robust absolute k-center problem, both under the assumption that k is a constant. We also improve on a previous 8-approximation for the non-robust discrete kinetic k-center problem, for arbitrary k, and show that our data structure achieves a (4+ε)-approximation. All these results hold in any metric space of constant doubling dimension, which includes Euclidean space of constant dimension

    Runaway Feedback Loops in Predictive Policing

    Full text link
    Predictive policing systems are increasingly used to determine how to allocate police across a city in order to best prevent crime. Discovered crime data (e.g., arrest counts) are used to help update the model, and the process is repeated. Such systems have been empirically shown to be susceptible to runaway feedback loops, where police are repeatedly sent back to the same neighborhoods regardless of the true crime rate. In response, we develop a mathematical model of predictive policing that proves why this feedback loop occurs, show empirically that this model exhibits such problems, and demonstrate how to change the inputs to a predictive policing system (in a black-box manner) so the runaway feedback loop does not occur, allowing the true crime rate to be learned. Our results are quantitative: we can establish a link (in our model) between the degree to which runaway feedback causes problems and the disparity in crime rates between areas. Moreover, we can also demonstrate the way in which \emph{reported} incidents of crime (those reported by residents) and \emph{discovered} incidents of crime (i.e. those directly observed by police officers dispatched as a result of the predictive policing algorithm) interact: in brief, while reported incidents can attenuate the degree of runaway feedback, they cannot entirely remove it without the interventions we suggest.Comment: Extended version accepted to the 1st Conference on Fairness, Accountability and Transparency, 2018. Adds further treatment of reported as well as discovered incident

    Fairness in representation: quantifying stereotyping as a representational harm

    Full text link
    While harms of allocation have been increasingly studied as part of the subfield of algorithmic fairness, harms of representation have received considerably less attention. In this paper, we formalize two notions of stereotyping and show how they manifest in later allocative harms within the machine learning pipeline. We also propose mitigation strategies and demonstrate their effectiveness on synthetic datasets.Comment: 9 pages, 6 figures, Siam International Conference on Data Minin

    Realistic Compression of Kinetic Sensor Data

    Get PDF
    We introduce a realistic analysis for a framework for storing and processing kinetic data observed by sensor networks. The massive data sets generated by these networks motivates a significant need for compression. We are interested in the kinetic data generated by a finite set of objects moving through space. Our previously introduced framework and accompanying compression algorithm assumed a given set of sensors, each of which continuously observes these moving objects in its surrounding region. The model relies purely on sensor observations; it allows points to move freely and requires no advance notification of motion plans. Here, we extend the initial theoretical analysis of this framework and compression scheme to a more realistic setting. We extend the current understanding of empirical entropy to introduce definitions for joint empirical entropy, conditional empirical entropy, and empirical independence. We also introduce a notion of limited independence between the outputs of the sensors in the system. We show that, even with this notion of limited independence and in both the statistical and empirical settings, the previously introduced compression algorithm achieves an encoding size on the order of the optimal

    Gaps in Information Access in Social Networks

    Full text link
    The study of influence maximization in social networks has largely ignored disparate effects these algorithms might have on the individuals contained in the social network. Individuals may place a high value on receiving information, e.g. job openings or advertisements for loans. While well-connected individuals at the center of the network are likely to receive the information that is being distributed through the network, poorly connected individuals are systematically less likely to receive the information, producing a gap in access to the information between individuals. In this work, we study how best to spread information in a social network while minimizing this access gap. We propose to use the maximin social welfare function as an objective function, where we maximize the minimum probability of receiving the information under an intervention. We prove that in this setting this welfare function constrains the access gap whereas maximizing the expected number of nodes reached does not. We also investigate the difficulties of using the maximin, and present hardness results and analysis for standard greedy strategies. Finally, we investigate practical ways of optimizing for the maximin, and give empirical evidence that a simple greedy-based strategy works well in practice.Comment: Accepted at The Web Conference 201

    Reducing Access Disparities in Networks using Edge Augmentation

    Full text link
    In social networks, a node's position is a form of \it{social capital}. Better-positioned members not only benefit from (faster) access to diverse information, but innately have more potential influence on information spread. Structural biases often arise from network formation, and can lead to significant disparities in information access based on position. Further, processes such as link recommendation can exacerbate this inequality by relying on network structure to augment connectivity. We argue that one can understand and quantify this social capital through the lens of information flow in the network. We consider the setting where all nodes may be sources of distinct information, and a node's (dis)advantage deems its ability to access all information available on the network. We introduce three new measures of advantage (broadcast, influence, and control), which are quantified in terms of position in the network using \it{access signatures} -- vectors that represent a node's ability to share information. We then consider the problem of improving equity by making interventions to increase the access of the least-advantaged nodes. We argue that edge augmentation is most appropriate for mitigating bias in the network structure, and frame a budgeted intervention problem for maximizing minimum pairwise access. Finally, we propose heuristic strategies for selecting edge augmentations and empirically evaluate their performance on a corpus of real-world social networks. We demonstrate that a small number of interventions significantly increase the broadcast measure of access for the least-advantaged nodes (over 5 times more than random), and also improve the minimum influence. Additional analysis shows that these interventions can also dramatically shrink the gap in advantage between nodes (over \%82) and reduce disparities between their access signatures
    corecore